Integrating the Human Recommendations in the Decision Process of Autonomous Agents: A Goal Biased Markov Decision Process
Abstract
In this paper, we address the problem of computing the policy of an autonomous agent while taking human recommendations into account, an approach suited to mixed-initiative or adjustable-autonomy settings. For this purpose, we present the Goal Biased Markov Decision Process (GBMDP), which assumes two kinds of recommendation: the human either recommends that the agent avoid some situations (represented by undesirable states) or recommends favorable situations (represented by desirable states). The agent takes these recommendations into account by updating its policy, revising only the states concerned by the recommendations rather than the whole policy. We show that GBMDP is efficient and that it improves the human's intervention by reducing the time of attention the human must pay to the agent. Moreover, GBMDP reduces the robot's computation time by updating only the necessary states. We also show how GBMDP can consider more than one recommendation. Finally, our experiments show how we update policies that are intractable for standard approaches.
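The idea of revising only the states affected by a recommendation can be illustrated with a small sketch. This is not the GBMDP algorithm from the paper, whose details the abstract does not give. It is a minimal illustration that assumes a deterministic toy MDP, models an "avoid this state" recommendation as a reward penalty on transitions entering that state, and propagates the change backward through predecessor states with a worklist (in the spirit of prioritized sweeping). All names (`TRANSITIONS`, `recommend_avoid`, the state labels, the penalty value) are hypothetical.

```python
from collections import deque

GAMMA = 0.9  # discount factor (assumed for this toy example)

# Hypothetical deterministic MDP: TRANSITIONS[state][action] = (next_state, reward)
TRANSITIONS = {
    "start": {"up": ("A", -1.0), "down": ("B", -2.0)},
    "A": {"go": ("goal", 10.0)},
    "B": {"go": ("goal", 10.0)},
    "side": {"go": ("goal", 10.0)},  # never passes through "A"
    "goal": {},                      # absorbing state, no actions
}

def backup(state, values, transitions):
    """One Bellman backup: return (best value, best action) for a state."""
    actions = transitions[state]
    if not actions:
        return 0.0, None
    best = max(actions, key=lambda a: actions[a][1] + GAMMA * values[actions[a][0]])
    nxt, reward = actions[best]
    return reward + GAMMA * values[nxt], best

def value_iteration(transitions, eps=1e-9):
    """Standard value iteration over the whole state space."""
    values = {s: 0.0 for s in transitions}
    policy = {s: None for s in transitions}
    while True:
        delta = 0.0
        for s in transitions:
            v, a = backup(s, values, transitions)
            delta = max(delta, abs(v - values[s]))
            values[s], policy[s] = v, a
        if delta < eps:
            return values, policy

def recommend_avoid(transitions, values, policy, bad_state, penalty, eps=1e-9):
    """Penalize entering bad_state, then re-backup only the states that can be affected."""
    new_t = {s: dict(acts) for s, acts in transitions.items()}
    preds = {s: set() for s in transitions}
    for s, acts in new_t.items():
        for a, (nxt, r) in list(acts.items()):
            preds[nxt].add(s)
            if nxt == bad_state:
                acts[a] = (nxt, r - penalty)  # assumed encoding of the recommendation
    values, policy = dict(values), dict(policy)
    queue, touched = deque(preds[bad_state]), set()
    while queue:
        s = queue.popleft()
        touched.add(s)
        v, a = backup(s, values, new_t)
        changed = abs(v - values[s]) > eps
        values[s], policy[s] = v, a
        if changed:
            queue.extend(preds[s])  # only predecessors of a changed state are revisited
    return values, policy, touched

values, policy = value_iteration(TRANSITIONS)
new_values, new_policy, touched = recommend_avoid(TRANSITIONS, values, policy, "A", penalty=5.0)
print(policy["start"], new_policy["start"], touched)
```

In this toy case the recommendation to avoid "A" flips the action at "start" from "up" to "down", and only "start" is re-evaluated. The state "side" is left alone because no path from it passes through "A".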
Similar resources
Optimizing Red Blood Cells Consumption Using Markov Decision Process
In healthcare systems, one important task concerns perishable products such as red blood cell (RBC) units, whose consumption management across different periods can contribute greatly to the optimality of the system. In this paper, the main goal is to enhance the ability of the medical community to organize the consumption of RBC units so as to deliver unit orders on time, with a focus ...
Interval-Based Markov Decision Processes for Regulating Interactions Between Two Agents in Multi-agent Systems
This work presents a model for Markov Decision Processes applied to the problem of keeping two agents in equilibrium with respect to the values they exchange when they interact. Interval mathematics is used to model the qualitative values involved in interactions. The optimal policy is constrained by the adopted model of social interactions. The MDP is assigned to a supervisor, that monitors th...
Application of multi-criteria decision making to estimate the potential of flooding
Integrating a geographic information system with multi-criteria decision making methods has led to spatial multi-criteria decision making methods. In this study, the spatial potential of flooding was determined based on the analytic network process and the analytic hierarchy process. First, six flooding factors were determined as criteria. The criteria were the slope, hill-slope asp...
A new machine replacement policy based on number of defective items and Markov chains
A novel optimal single machine replacement policy using a single as well as a two-stage decision making process is proposed based on the quality of items produced. In a stage of this policy, if the number of defective items in a sample of produced items is more than an upper threshold, the machine is replaced. However, the machine is not replaced if the number of defective items is less than ...
Accelerated decomposition techniques for large discounted Markov decision processes
Many hierarchical techniques to solve large Markov decision processes (MDPs) are based on the partition of the state space into strongly connected components (SCCs) that can be classified into some levels. In each level, smaller problems named restricted MDPs are solved, and then these partial solutions are combined to obtain the global solution. In this paper, we first propose a novel algorith...
Publication date: 2011